Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

require spec.version to be set in Kiali CR and OSSMConsole CR. #818

Closed

Conversation

jmazzitelli
Copy link
Contributor

fixes: kiali/kiali#7741

This fix goes across all supported versions. There is nothing to backport to different version roles. We can backport this to the v1.89 branch, though.

nrfox
nrfox previously requested changes Sep 12, 2024
Copy link
Contributor

@nrfox nrfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat -1 on this change. If you can specify default to get the latest version of the CR that the operator supports, shouldn't that just be the actual default?

@jmazzitelli
Copy link
Contributor Author

jmazzitelli commented Sep 12, 2024

To test the behavior on minikube:

  1. Install minikube, Istio: hack/k8s-minikube.sh start && hack/k8s-minikube.sh istio
  2. Build, push, deploy everything: make -e CLUSTER_TYPE=minikube cluster-push operator-create kiali-create
  3. At this point, Kiali should be installed. But now edit the Kiali CR (kubectl edit kiali kiali -n kiali-operator) and remove the spec.version field. After that, look at the status field of the Kiali CR:
kubectl get kiali kiali -n kiali-operator -oyaml

You should see:

    lastTransitionTime: "2024-09-12T18:43:18Z"
    message: |
      The CR is missing the spec.version field.
      You must explicitly indicate which version you want.
      The spec.version field must be either the literal string 'default' or
      one of the supported versions in the form of 'vX.Y'.
      Please consult the Kiali documentation for more details.
    reason: Failed
    status: "True"
    type: Failure
  1. Edit the Kiali CR again and set a specific spec.version (say, v1.89) and see Kiali is able to be deployed.

@jmazzitelli
Copy link
Contributor Author

If you can specify default to get the latest version of the CR that the operator supports, shouldn't that just be the actual default?

The whole idea here is to require the user to be explicit - they need to tell us what version they want. Leaving it blank is ambiguous - what do they want? We do not know explicitly. And I don't think they know :) Forcing them to put a value here will prod them into looking at the docs to determine what version they need/want.

We are simply trying to avoid the problems we have been hitting today with the odd way 1.89 was released but not supported on currently released OSSM releases. That's one edge case. There are others that will happen in the future (OSSM 3.0 and 3.1). In short, different versions of Kiali will be needed to support different control plane versions, which could be different than the Kiali Operator version. We want the user to be very explicit and intentional - they need to tell us what version they want.

@jmazzitelli
Copy link
Contributor Author

To test on OpenShift (and test OSSMConsole CR):

  1. Install CRC, Istio and build/push dev images:
hack/crc-openshift.sh start -p [your pull secret path] \
 && oc login -u kubeadmin -p kiali https://api.crc.testing:6443 --insecure-skip-tls-verify \
 && podman login --tls-verify=false -u kiali -p $(oc whoami -t) default-route-openshift-image-registry.apps-crc.testing \
 && hack/istio/install-istio-via-istioctl.sh -cp openshift \
 && make build build-ui cluster-push
  1. In the OSSMC local git repo, make cluster-push so you build and push the OSSMC image
  2. Deploy the operator, kiali, and ossmc: make operator-create kiali-create ossmconsole-create
  3. Edit the OSSMConsole CR and remove its spec.version field: oc edit ossmconsoles.kiali.io ossmconsole -n ossmconsole
  4. Look at the OSSMConsole CR and see the error in the status field
  5. Edit the OSSMConsole CR again, this time set spec.version to v1.89 - at this point, the operator logs should show you it is attempting to remove/uninstall any existing OSSMC using the default ansible role (which is what it used when it installed OSSMC in step 3). It then installs 1.89. (NOTE: this is eventually going to fail because your server is 1.90-SNAPSHOT so the operator will abort because OSSMC needs a server which is the same version as itself; but ignore this. If you got this far, the test for this PR passes).

@jshaughn jshaughn added the requires docs PR Requires kiali.io or other documentation updates label Sep 13, 2024
@nrfox
Copy link
Contributor

nrfox commented Sep 16, 2024

Leaving it blank is ambiguous - what do they want?

I don't think it's ambiguous. You get the latest version that the operator supports. And IMO that's the expected behavior and a sane default. It's partly why we ensure the latest version of Kiali supports the latest version of Istio.

This change seems more a byproduct of how the operator is released and managed on OLM which isn't relevant to non-OLM users. That being said, even though it will make it harder for new users to deploy Kiali, maybe it will save some headache for users running an older version of Istio with a newer version of the operator.

@nrfox nrfox dismissed their stale review September 16, 2024 12:27

See above latest comment

Copy link
Contributor

@nrfox nrfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last thing I'll add about this change is that users will still be able to create Kiali CRs without the version field set because there is no validation on the Kiali CRD. Only now they will have to notice that the Kiali deployment is not being created after some time and then do kubectl get kialis ... -o yaml to see the ReconcileError that explains they need to set spec.version explicitly.

@jmazzitelli
Copy link
Contributor Author

then do kubectl get kialis ... -o yaml to see the ReconcileError that explains they need to set spec.version explicitly.

Right. We are not doing any magic here - all we are trying to avoid is people getting errors that provide no clue as to what the actual problem is. This forces people to then create a github issue or discussion, or create a customer support ticket, and having to wait for an answer. This change allows them to fix it themselves without having to ask for support. We are trying to avoid wasting developer/customer support time, and wasting the user's time having to try to figure out what the problem is.

Yes, it is possible they don't even bother looking at the status field of the CR, but this at least gives them a chance to know how to fix themselves if they do just a basic check of the Kiali CR.

@ferhoyos
Copy link
Collaborator

In this case, I agree with @nrfox. If I leave it empty, I expect the same Kiali version to be installed as the operator. Even though spec.version is required, I think people will set the default value (the easiest option) and face the same issues. If I were to require the field, I would force the user to specify a version (e.g., v1.89, v1.73, etc.). I'm not sure if it’s possible for the operator to automatically populate this value in the Kiali CR with the operator's version by default (instead of using the default value)

@jmazzitelli
Copy link
Contributor Author

I'm not sure if it’s possible for the operator to automatically populate this value

Having Kiali automatically populate the value doesn't solve the problem - its the same issue. WHAT do we think the user wanted to do? We could automatically populate the value, but the question remains - what does the user want?

In fact, we do effectively auto-populate it - its just "default" - that IS the default :)

@jmazzitelli
Copy link
Contributor Author

If I leave it empty, I expect the same Kiali version to be installed as the operator.

And that's fine - that's how it works.

Then why are people breaking, and more importantly, not understanding WHY they are breaking, if they are getting what they expect? :)

@jshaughn
Copy link
Contributor

This is a borderline call. Although I understand the reasons against, I'm in favor of the change. I don't feel like it's too much to ask to have users be explicit about their versioning when they are already going to be defining other things in the CR. It will help them avoid foot-gunning themselves. I don't love that if the installation stops that they then need to look at the operator logs to see why, but at least you don't end up with wonky behavior, and we can say, "check the logs". The fact that the operator is "one ring to rule them all", able to install multiple versions, is what makes this a useful change. When installing with Helm Charts the user is doing explicit versioning by choosing the HC version they want, to me this is somewhat analogous.

If we don't make the change I think we should minimally have the the Operator log that the value was unset and we are assuming spec.version: default, and installing the latest version. That woukd have to go hand-in-hand with a FAQ that we can point to when they mess up.

Since this is a +2, -2 situation, we may need a maintainer vote to resolve this.

@jmazzitelli
Copy link
Contributor Author

jmazzitelli commented Sep 17, 2024

I don't love that if the installation stops that they then need to look at the operator logs to see why

You don't have to look in the logs...just look at the status field of the Kiali CR and you'll see the error message telling them what to do. For what it looks like, see #818 (comment)

@nrfox
Copy link
Contributor

nrfox commented Sep 18, 2024

You don't have to look in the logs...just look at the status field of the Kiali CR and you'll see the error message telling them what to do.

I'd be more in favor of this change if we could make the version field on the CRD required so that you are unable to actually create a Kiali CR without setting this field. But because the Kiali CRD is unstructured, we cannot do this. Personally, I think it's just as confusing to be able to create a Kiali CR only to have it fail to reconcile and then need to look at the status to know that you have to set a certain field.

What if we kept the default behavior of getting the latest version the operator supports and we updated all the docs to include an explicit version field? That way if you are just following along with the documentation, you will set the version field explicitly. And if you happen to create a Kiali CR without the version field set, it will still reconcile.

@jmazzitelli
Copy link
Contributor Author

and we updated all the docs to include an explicit version field?

We are going to want to do that anyway - regardless of the decision we make here. I think ALL our docs showing a Kiali CR should show the spec.version field.

@jshaughn
Copy link
Contributor

jshaughn commented Sep 19, 2024

OK, we have a 2-2 tie for how to handle this issue. I'm asking the remaining maintainers and active testers, to vote or actively abstain if you don't feel strongly either way. Voting starts now and ends on Sep 26, 10am ET, or until we have a clear majority. If we end in a tie the lead maintainer will decide (me).

for: spec.version is required
against: spec.version is optional
X: abstain

For
@jmazzitelli
@jshaughn

Against
@ferhoyos
@nrfox
@hhovsepy
@leandroberetta

Abstain
@josunect

No Vote
@aljesusg
@ScriptingShrimp
@Xunzhuo

@josunect
Copy link
Contributor

I'll abstain.

I understand some of the errors in the logs when Istio is not compatible are very difficult to troubleshoot, and I'm agree this PR would avoid this situation, but I’m not sure that specifying the version will prevent users from asking, as the reason for the CR not starting may not be clear. I would expect not having to specify the version, I would expect that the latest compatible version will be installed.

IMO, the ideal situation would be to do an Istio compatibility check and log an error to warn the user if the versions are not compatible - avoiding these log errors Incomprehensible for the user -, but I understand this can be really hard to maintain.

@leandroberetta
Copy link
Contributor

leandroberetta commented Sep 27, 2024

Sorry for the late response, my vote is to use a default (latest) if there is no version set.

@hhovsepy
Copy link
Contributor

My vote is for having optional value and using the default latest one when not set.

@jshaughn
Copy link
Contributor

OK, thanks to all that voted, it looks like this change is unpopular (4-2-1 against), and so we won't require spec.version. We need to deal with this via docs and whatever we can provide in the logs (a la 'spec.version unset, assuming default and installing kiali server <version>, without actually stopping the install.

@jmazzitelli
Copy link
Contributor Author

I closed my PRs.

I am sure all of those who were against this will be diligently watching Slack/JIRA/github for those people that will have this "missing spec.version" problem, and they will promptly provide those persons the support they will need to fix the problem. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
requires docs PR Requires kiali.io or other documentation updates requires helm chart PR
Projects
Development

Successfully merging this pull request may close these issues.

make spec.version required in Kiali CR and OSSMConsole CR
7 participants